predictive relationship
The Principles of Human-like Conscious Machine
Determining whether another system, biological or artificial, possesses phenomenal consciousness has long been a central challenge in consciousness studies. This attribution problem has become especially pressing with the rise of large language models and other advanced AI systems, where debates about "AI consciousness" implicitly rely on some criterion for deciding whether a given system is conscious. In this paper, we propose a substrate-independent, logically rigorous, and counterfeit-resistant sufficiency criterion for phenomenal consciousness. We argue that any machine satisfying this criterion should be regarded as conscious with at least the same level of confidence with which we attribute consciousness to other humans. Building on this criterion, we develop a formal framework and specify a set of operational principles that guide the design of systems capable of meeting the sufficiency condition. We further argue that machines engineered according to this framework can, in principle, realize phenomenal consciousness. As an initial validation, we show that humans themselves can be viewed as machines that satisfy this framework and its principles. If correct, this proposal carries significant implications for philosophy, cognitive science, and artificial intelligence. It offers an explanation for why certain qualia, such as the experience of red, are in principle irreducible to physical description, while simultaneously providing a general reinterpretation of human information processing. Moreover, it suggests a path toward a new paradigm of AI beyond current statistics-based approaches, potentially guiding the construction of genuinely human-like AI.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > China > Chongqing Province > Chongqing (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.88)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Bootstrapped Control Limits for Score-Based Concept Drift Control Charts
Wu, Jiezhong, Apley, Daniel W.
Monitoring for changes in a predictive relationship represented by a fitted supervised learning model (aka concept drift detection) is a widespread problem, e.g., for retrospective analysis to determine whether the predictive relationship was stable over the training data, for prospective analysis to determine when it is time to update the predictive model, for quality control of processes whose behavior can be characterized by a predictive relationship, etc. A general and powerful Fisher score-based concept drift approach has recently been proposed, in which concept drift detection reduces to detecting changes in the mean of the model's score vector using a multivariate exponentially weighted moving average (MEWMA). To implement the approach, the initial data must be split into two subsets. The first subset serves as the training sample to which the model is fit, and the second subset serves as an out-of-sample test set from which the MEWMA control limit (CL) is determined. In this paper, we develop a novel bootstrap procedure for computing the CL. Our bootstrap CL provides much more accurate control of false-alarm rate, especially when the sample size and/or false-alarm rate is small. It also allows the entire initial sample to be used for training, resulting in a more accurate fitted supervised learning model. We show that a standard nested bootstrap (inner loop accounting for future data variability and outer loop accounting for training sample variability) substantially underestimates variability and develop a 632-like correction that appropriately accounts for this. We demonstrate the advantages with numerical examples.
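The MEWMA recursion and bootstrap control limit described in the abstract can be sketched as follows. This is a minimal, single-level illustration under our own assumptions, not the authors' nested, 632-corrected procedure: it assumes the score vectors are already computed and are roughly mean-zero in control, and the function names are ours.

```python
import numpy as np

def mewma_stats(scores, lam=0.1):
    """Run a MEWMA over a sequence of score vectors and return the
    T^2-style monitoring statistic at each step."""
    scores = np.asarray(scores, dtype=float)
    n, p = scores.shape
    # Steady-state MEWMA covariance: (lam / (2 - lam)) * Sigma
    sigma = np.cov(scores, rowvar=False) + 1e-8 * np.eye(p)
    sigma_z_inv = np.linalg.inv((lam / (2.0 - lam)) * sigma)
    z = np.zeros(p)
    stats = np.empty(n)
    for t in range(n):
        z = lam * scores[t] + (1.0 - lam) * z  # EWMA of score vectors
        stats[t] = z @ sigma_z_inv @ z         # quadratic-form statistic
    return stats

def bootstrap_control_limit(test_scores, lam=0.1, alpha=0.05,
                            horizon=200, n_boot=500, rng=None):
    """Resample the out-of-sample score vectors with replacement, rerun
    the MEWMA on each replicate, and take the (1 - alpha) quantile of
    the run maxima as the control limit (CL)."""
    rng = np.random.default_rng(rng)
    test_scores = np.asarray(test_scores, dtype=float)
    maxima = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, len(test_scores), size=horizon)
        maxima[b] = mewma_stats(test_scores[idx], lam=lam).max()
    return float(np.quantile(maxima, 1.0 - alpha))
```

A run whose statistic exceeds the bootstrapped CL would signal drift; the paper's contribution is the additional outer resampling loop (and its bias correction) that this sketch omits.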
Asymmetric predictive relationships across histone modifications - Nature Machine Intelligence
Decoding the epigenomic landscapes in diverse tissues and cell types is fundamental to understanding molecular mechanisms underlying many essential cellular processes and human diseases. Recent advances in artificial intelligence provide new methods and strategies for imputing unknown epigenomes based on existing data, yet how to reveal the predictive relationships among epigenetic marks remains largely unexplored. Here we present a machine learning approach for epigenomic imputation and interpretation. Through dissection of the spatial contributions from six histone marks, we reveal the prevalent and asymmetric cross-prediction relationships among these marks. Meanwhile, our approach achieved high predictive performance on held-out prospective epigenomes and outperformed the state of the art. To facilitate future research, we further applied this approach to impute a total of 527 and 2,455 unavailable genome-wide histone modification signal tracks for the ENCODE3 and Roadmap datasets, respectively.

Knowledge of the wide array of epigenomic signals provides biological insight into the state of a given cell type, but it is infeasible to experimentally characterize all possible types of epigenomic signal in the multitude of cell types in the human body. The authors present Ocelot, a machine learning approach for imputing cell-type-specific epigenomic signals along the genome.
Daily Digest
Deep learning has disrupted nearly every field of research, including those of direct importance to drug discovery, such as medicinal chemistry and pharmacology. This revolution has largely been attributed to the unprecedented advances in highly parallelizable graphics processing units (GPUs) and the development of GPU-enabled algorithms. In this Review, the authors present a comprehensive overview of historical trends and recent advances in GPU algorithms and discuss their immediate impact on the discovery of new drugs and drug targets.

R is an increasingly preferred software environment for data analytics and statistical computing among scientists and practitioners. Packages markedly extend R's utility and ameliorate inefficient solutions to data science problems.
Detecting Dependencies in Sparse, Multivariate Databases Using Probabilistic Programming and Non-parametric Bayes
Saad, Feras, Mansinghka, Vikash
Datasets with hundreds of variables and many missing values are commonplace. In this setting, it is both statistically and computationally challenging to detect true predictive relationships between variables and also to suppress false positives. This paper proposes an approach that combines probabilistic programming, information theory, and non-parametric Bayes. It shows how to use Bayesian non-parametric modeling to (i) build an ensemble of joint probability models for all the variables; (ii) efficiently detect marginal independencies; and (iii) estimate the conditional mutual information between arbitrary subsets of variables, subject to a broad class of constraints. Users can access these capabilities using BayesDB, a probabilistic programming platform for probabilistic data analysis, by writing queries in a simple, SQL-like language. This paper demonstrates empirically that the method can (i) detect context-specific (in)dependencies on challenging synthetic problems and (ii) yield improved sensitivity and specificity over baselines from statistics and machine learning, on a real-world database of over 300 sparsely observed indicators of macroeconomic development and public health.
- North America > United States > Massachusetts (0.04)
- South America > Chile (0.04)
- Oceania > Australia (0.04)
- (4 more...)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
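The abstract above frames dependence detection in terms of (conditional) mutual information estimated over an ensemble of Bayesian non-parametric models, queried through BayesDB. As a much simpler illustration of the core idea — testing an information-theoretic dependence estimate against a permutation null on sparsely observed data — here is a hypothetical sketch; the plug-in histogram estimator and function names are ours, not BayesDB's.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Plug-in (histogram) estimate of mutual information, in nats."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y
    nz = pxy > 0                          # avoid log(0) on empty cells
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

def dependence_pvalue(x, y, n_perm=500, rng=None):
    """Permutation test for dependence: shuffle y to simulate the
    independence null, using only rows where both variables are
    observed (non-NaN), as in sparsely observed databases."""
    rng = np.random.default_rng(rng)
    mask = ~(np.isnan(x) | np.isnan(y))
    x, y = x[mask], y[mask]
    observed = mutual_information(x, y)
    null = np.array([mutual_information(x, rng.permutation(y))
                     for _ in range(n_perm)])
    return (1 + (null >= observed).sum()) / (n_perm + 1)
```

The permutation null is what keeps the plug-in estimator's bias from producing false positives: the same bias appears in the shuffled replicates, so only genuine dependence pushes the observed estimate into the tail.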
New 'machine unlearning' technique wipes out unwanted data quickly and completely
The novel approach to making systems forget data is called "machine unlearning" by the two researchers who are pioneering the concept. Machine learning systems are everywhere. Computer software in these machines predicts the weather, forecasts earthquakes, provides recommendations based on the books and movies we like and even applies the brakes on our cars when we are not paying attention. To do this, computer systems are programmed to find predictive relationships calculated from the massive amounts of data we supply to them. Machine learning systems use advanced algorithms (a set of rules for solving math problems) to identify these predictive relationships using "training data."
- Information Technology > Security & Privacy (1.00)
- Law (0.71)
Multi-view predictive partitioning in high dimensions
McWilliams, Brian, Montana, Giovanni
Many modern data mining applications are concerned with the analysis of datasets in which the observations are described by paired high-dimensional vectorial representations or "views". Some typical examples can be found in web mining and genomics applications. In this article we present an algorithm for data clustering with multiple views, Multi-View Predictive Partitioning (MVPP), which relies on a novel criterion of predictive similarity between data points. We assume that, within each cluster, the dependence between multivariate views can be modelled by using a two-block partial least squares (TB-PLS) regression model, which performs dimensionality reduction and is particularly suitable for high-dimensional settings. The proposed MVPP algorithm partitions the data such that the within-cluster predictive ability between views is maximised. The proposed objective function depends on a measure of predictive influence of points under the TB-PLS model which has been derived as an extension of the PRESS statistic commonly used in ordinary least squares regression. Using simulated data, we compare the performance of MVPP to that of competing multi-view clustering methods which rely upon geometric structures of points, but ignore the predictive relationship between the two views. State-of-the-art results are obtained on benchmark web mining datasets.
- North America > United States > Wisconsin (0.04)
- North America > United States > Texas (0.04)
- North America > United States > Montana (0.04)
- (4 more...)
- Information Technology > Data Science > Data Mining (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)
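The predictive-influence measure in the MVPP abstract extends the PRESS statistic from ordinary least squares. For reference, in OLS the PRESS statistic has a closed form via the hat-matrix leverages, so the leave-one-out residuals require no model refitting; the sketch below (function name ours) illustrates that identity, not the TB-PLS extension itself.

```python
import numpy as np

def press_and_influence(X, y):
    """PRESS statistic for OLS and the per-point predictive influence.
    Leave-one-out residuals come in closed form from the hat matrix:
    e_(i) = e_i / (1 - h_ii), so no refitting is needed."""
    Xd = np.column_stack([np.ones(len(X)), X])    # add intercept column
    H = Xd @ np.linalg.inv(Xd.T @ Xd) @ Xd.T      # hat matrix
    resid = y - H @ y                             # ordinary residuals
    loo_resid = resid / (1.0 - np.diag(H))        # leave-one-out residuals
    influence = loo_resid ** 2                    # per-point influence
    return float(influence.sum()), influence
```

In MVPP the analogous quantity is computed under the TB-PLS model and used to decide which cluster best predicts each point across views; the OLS version above is the special case the abstract cites.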